Named Entity Recognition and Classification in Historical Documents: A Survey
Abstract
After decades of massive digitisation, an unprecedented amount of historical documents is available in digital format, along with their machine-readable texts. While this represents a major step forward with respect to preservation and accessibility, it also opens up new opportunities in terms of content mining, and the next fundamental challenge is to develop appropriate technologies to efficiently search, retrieve and explore information from this ‘big data of the past’. Among semantic indexing opportunities, the recognition and classification of named entities are in great demand among humanities scholars. Yet, named entity recognition (NER) systems are heavily challenged with diverse, historical and noisy inputs. In this survey, we present the array of challenges posed by historical documents to NER, inventory existing resources, describe the main approaches deployed so far, and identify key priorities for future developments.
Similar resources
A survey of named entity recognition and classification
The term “Named Entity”, now widely used in Natural Language Processing, was coined for the Sixth Message Understanding Conference (MUC-6) (R. Grishman & Sundheim 1996). At that time, MUC was focusing on Information Extraction (IE) tasks where structured information of company activities and defense related activities is extracted from unstructured text, such as newspaper articles. In defining ...
A Survey of Arabic Named Entity Recognition and Classification
As more and more Arabic textual information becomes available through the Web in homes and businesses, via Internet and Intranet services, there is an urgent need for technologies and tools to process the relevant information. Named Entity Recognition (NER) is an Information Extraction task that has become an integral part of many other Natural Language Processing (NLP) tasks, such as Machine T...
Named Entity Recognition in Vietnamese documents
Named Entity Recognition (NER) aims to classify words in a document into pre-defined target entity classes and is now considered to be fundamental for many natural language processing tasks such as information retrieval, machine translation, information extraction and question answering. This paper presents the results of an experiment in which a Support Vector Machine (SVM) based NER model is ...
Named Entity Recognition for Digitised Historical Texts
We describe and evaluate a prototype system for recognising person and place names in digitised records of British parliamentary proceedings from the late 17th and early 19th centuries. The output of an OCR engine is the input for our system and we describe certain issues and errors in this data and discuss the methods we have used to overcome the problems. We describe our rule-based named enti...
Named Entity Recognition in Persian Text using Deep Learning
Named entities recognition is a fundamental task in the field of natural language processing. It is also known as a subset of information extraction. The process of recognizing named entities aims at finding proper nouns in the text and classifying them into predetermined classes such as names of people, organizations, and places. In this paper, we propose a named entity recognizer which benefi...
Journal
Journal title: ACM Computing Surveys
Year: 2023
ISSN: 0360-0300, 1557-7341
DOI: https://doi.org/10.1145/3604931